Abstract


Query optimization in database systems with non-parametric query metrics
cannot be done using dynamic programming approaches. In this thesis, we propose
an approach to solve the query optimization problem for the complex non-parametric
query metric

$B(p) = \max(a_0(p), a_1(p), \ldots, a_n(p))$

by formulating affine problems from it. This approach can be applied to other
complex non-parametric query metrics that are monotonic.
Chapter 1
Introduction 


1.1 Query Optimization 

With rapid strides in computing and information technologies creating the
need to manage voluminous data, Database Management Systems have become
a de facto standard for data processing and information retrieval. Database
Management Systems offer features for efficient, robust, fault-tolerant and
user-friendly data access by abstracting data as objects, relations or
other hierarchical concepts. The Internet revolution and data-intensive applications
such as GIS and multimedia repositories have triggered the need to lower the
response time of user requests to retrieve data.

A request for data retrieval to a database system is specified as a non-procedural
SQL query. Originally, SQL was called SEQUEL (for Structured English QUEry
Language) and was designed and implemented at IBM Research as the interface for
an experimental relational database system called SYSTEM R. SQL is now the de
facto standard relational database language. A query in SQL specifies the relations
to be accessed, the attributes to be listed and the predicates which are to be
satisfied by every tuple.

For example, in Structured Query Language we can express the query to retrieve
the names of all the students of the CSE department from the relation
STUDENT(NAME, ROLL, DEPT) as

Select Name
From STUDENT
Where DEPT = "CSE"

Figure 1.1: Steps of processing a query in high-level language. (The code to
execute the query is passed to the runtime database processor, which produces the
result of the query; the code can be executed directly in interpreted mode, or stored
and executed later whenever needed in compiled or embedded mode.)

The SQL query specifies only the predicates which must be satisfied by tuples
selected from the given relations. It relieves the user of the burden of specifying
the access path (primary or secondary) to be used, the join methods (nested loop,
sort-merge or hash join) or the join order to be applied for processing the query.
The overall strategy of processing a query given in a non-procedural language like
SQL, or embedded in a high-level language, is shown in Figure 1.1.
A query expressed in a high-level query language must first be scanned, parsed
and validated [NAEL94]. The scanner identifies the language tokens in the query
and the parser checks whether the query conforms to the syntax rules of the
query language. The query is then validated by checking that all the relations
exist, the attributes are valid and the expression is semantically meaningful. The
query is then represented internally, usually as a tree or a graph data structure,
called a query tree or query graph. An execution strategy for the query graph
must then be established by the database, specifying the access path to be
selected, the join technique and the join order.

The data about the data in the database, called metadata, is stored in the system
catalogs. The system catalogs store the schemas or descriptions of the databases
that the DBMS maintains. They describe the conceptual database schema, the internal
schema, any external schemas and the mapping between the schemas at different
levels. From the catalogs, information regarding the available access paths, the sizes
of relations and the selectivities of predicates can be derived. The information in
the catalogs is periodically updated by some module of the backend database engine.

A query usually has multiple execution strategies, and the technique of choosing
the best one with respect to execution cost from the feasible set of plans is termed
query optimization [KS86]. The query optimizer module produces an execution
plan, and the code generator generates the code to execute that plan. The runtime
database processor generates the query result by running the query code, whether
compiled or interpreted.

The above setup leaves the query optimizer considerable room in choosing an
execution plan. In the case of joins, or where the cardinality of the relations
is very large, the cardinality of intermediate relations can vary substantially, and
hence so can the execution and response times. In data-intensive applications or in
real-time information systems, where response time is at a premium, applying
optimization techniques to the query graph can substantially improve the performance
of the system.

Query optimization techniques can broadly be categorized as either cost-based
optimizers or rule-based optimizers. The cost-based optimizers base their decisions
on some cost model.
1.2 Cost based and Rule based optimizers


Rule-based optimizers use a rigid set of rules to determine an execution plan
for any SQL statement. Rule description languages are used to specify the
general algebraic transformations or the execution strategy for a given type of query.
The current state of the database has no effect on the execution plans. The decisions
are taken statically: a given query is always transformed to the same query
execution plan.

Cost-based optimizers base their execution plan selection on some cost model
and choose, among the feasible plans, the plan having the minimum cost.
The cost of executing a query can be modeled in terms of accesses to secondary
storage, the response time to fetch the requested data, and so on. The cost of
executing a query includes the following components:

• Access cost to secondary storage. 

• Storage cost. 

• Computation costs. 

• Communication costs. 

In large databases, minimizing the access cost to secondary storage is the main
concern. In smaller databases, where most of the data involved can be kept in main
memory, the main concern is minimizing the response time. In distributed
databases, where multiple sites are involved, the communication
costs also have a substantial impact. The current state of the database is obtained
from the catalogs, and the execution plan having the minimum cost for the present
state of the database is chosen.


1.3 Parametric Query Optimization 

Cost-based optimizers optimize queries based on some cost model. The cost of a
query execution plan depends on many parameters, such as available memory, cardinality
of base relations, size of intermediate results, selectivities of predicates, available
access paths, processor speeds, communication links and disk latencies. Query
optimizers compile a query into the best execution plan assuming that all the parameter
values are known at compile time. In practice the parameter values change between
compile time and run time, with a substantial impact on costs. Also, in the case of
embedded SQL constructs, the predicates may have unspecified variables which are
known only at run time. Since the execution environment and the database system are
constantly changing, a query plan statically optimized at compile time might turn out
to be suboptimal at run time. This result was shown by Graefe and Ward [GrW89].
A typical scenario would be a web-based status enquiry in a railway reservation
system, where the user enquires about the current status of a ticket (PNRNO = x).

Select *
From RAILRES
Where PNRNO = x

Depending on the value of the variable x, the optimal plan may choose either of
the access paths: a sequential scan or the index on PNRNO.
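As a rough illustration of why the better access path depends on the run-time value, the sketch below compares two hypothetical cost formulas. The page count, the index height and both formulas are assumptions chosen only for illustration; they are not taken from the thesis or from any particular system.

```python
def seq_scan_cost(total_pages):
    # A sequential scan reads every page regardless of the predicate.
    return total_pages


def index_scan_cost(matching_tuples, index_height=3):
    # Assumed model: descend the index once, then fetch one page per match.
    return index_height + matching_tuples


def choose_access_path(matching_tuples, total_pages=1000):
    """Return the cheaper access path and its estimated cost."""
    seq = seq_scan_cost(total_pages)
    idx = index_scan_cost(matching_tuples)
    return ("index", idx) if idx <= seq else ("sequential", seq)


print(choose_access_path(1))      # few matches: the index wins
print(choose_access_path(5000))   # many matches: the scan wins
```

Under these assumed formulas the crossover happens when the number of matching tuples approaches the number of pages in the relation.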

The problem of parametric query optimization is to compute the different optimal
plans as the cost-affecting parameters vary. Typically, a single plan is not
optimal for all values of the parameters. A plan is optimal only for a subset
of values, called the region of optimality of the plan. One naive approach would be to
find the optimal plan for each value of the parameters, but this would be
computationally prohibitive. Parametric query optimization instead compiles a query
into a set of plans, called the parametric optimal set of plans, each optimal for some
range of parameter values.
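For a single parameter and affine costs, the parametric optimal set and the regions of optimality can be computed by brute force, as in the sketch below. The three plans and their coefficients are hypothetical, and the pairwise-breakpoint method is only a simple stand-in for illustration, not one of the algorithms surveyed in this chapter.

```python
from fractions import Fraction

# Hypothetical plans: name -> (a1, a2), with cost C(p, s) = a1 + a2 * s.
PLANS = {"p": (1, 10), "q": (3, 5), "r": (6, 1)}


def cost(name, s):
    a1, a2 = PLANS[name]
    return a1 + a2 * s


def parametric_optimal_set(lo=Fraction(0), hi=Fraction(1)):
    """Return [(plan, left, right)]: each plan with its region of optimality."""
    names = list(PLANS)
    # Candidate breakpoints: interval ends plus pairwise isocost points inside.
    points = {lo, hi}
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            (a1, a2), (b1, b2) = PLANS[p], PLANS[q]
            if a2 != b2:
                s = Fraction(b1 - a1, a2 - b2)
                if lo < s < hi:
                    points.add(s)
    points = sorted(points)
    regions = []
    for left, right in zip(points, points[1:]):
        # No two cost lines cross strictly inside (left, right), so the plan
        # that is best at the midpoint is best on the whole sub-interval.
        mid = (left + right) / 2
        best = min(names, key=lambda n: cost(n, mid))
        if regions and regions[-1][0] == best:
            regions[-1] = (best, regions[-1][1], right)
        else:
            regions.append((best, left, right))
    return regions


for plan, left, right in parametric_optimal_set():
    print(plan, left, right)
```

Exact rational arithmetic is used so that breakpoints compare equal without floating-point tolerance.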

1.4 Non-Geometric and Geometric Approaches

Various non-geometric approaches for this problem have been designed by Ioannidis
et al. [INS+92], Cole and Graefe [CG94], Antoshenkov [Ant93], Sumit Ganguly
and Krishnamurthy [GK94], and Sumit Ganguly [Ganguly98]. [INS+92] follows a
randomized approach. The Cole and Graefe [CG94] technique is based on a partial
ordering of the costs of different plans. These techniques also generate a number of
suboptimal plans. Above all, these non-geometric techniques do not find the
range of parameter values over which a plan is optimal.

The geometric approach to the parametric query optimization problem, called the
isopoint method, was introduced in [Ganguly98]. The algorithm for a linear
one-parameter cost metric is in [Anjali99]. The algorithms discussed in [Ganguly98]
have been extended to ternary linear cost functions and binary non-linear cost
functions in [Prasad99]. The geometric approaches for n-parameter linear and non-linear
cost functions have been successively solved in [SGPrUmAn2001]. Thus the problem of
parametric query optimization for linear and non-linear parameterized cost
equations has been solved. [SGPrUmAn2001] gives an algorithm for the linear
parametric query optimization problem.

Some non-geometric approaches to parametric query optimization for two different
classes of query graphs, namely linear and star query graphs, are in [SVUMRao99].

1.5 PQO problem 

In this section we define the parametric query optimization (PQO) problem. The
definition is the same as that given in [Ganguly98].

Let $s_1, s_2, \ldots, s_n$ denote n parameters, where each $s_i$ quantifies some cost
parameter, such as selectivity, table size or available memory. In order to take into
account the possible variation of these parameters, a logical possibility is to compile
a query into a set of optimal plans called the parametric optimal set of plans. For
every legal value of the parameters $s_1, \ldots, s_n$, there is a plan in the parametric
optimal set that is optimal for that value, and vice versa. The region of optimality
of a plan p is defined by the set

$\{(s_1, \ldots, s_n) \mid p \text{ is optimal at } (s_1, \ldots, s_n)\}$

The problem of parametric query optimization is to find the parametric optimal set
of plans and the region of optimality of each parametric optimal plan.

The parameterized cost equation can be affine or can include nonlinear terms, and
the number of parameters can be n. The PQO problem has been solved for both the
linear and the nonlinear case.

1.5.1 Affine problems 

This section gives the definition of affine problems and some of their properties.
These have been discussed in [SGPrUmAn2001].

Definition 1: A PQO problem is said to be affine if the following three conditions
hold.

1. The feasible space F is a convex polyhedron of $\mathbb{R}^n$.

2. For $x = (x_1, x_2, \ldots, x_n) \in F$, the cost function C(p, x) is an affine function
of x, that is, C(p, x) has the form

$C(p, x) = a_1(p) + a_2(p) x_1 + a_3(p) x_2 + \ldots + a_{n+1}(p) x_n$

3. There exists a function optimize(x) that returns the set of plans that are
(equally) optimal at a point $x \in F$.

Conditions 1 and 2 in Definition 1 are “structural” properties. The third condition 
is about the existence of a computable (and hopefully easy) procedure to solve the 
non-parametric problem. 
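The affine cost form and the role of optimize(x) can be made concrete with a small sketch. Here optimize is a brute-force scan over an explicit, hypothetical list of plans; a real optimizer would search the plan space instead, and the plan names and coefficients are invented for illustration.

```python
def affine_cost(coeffs, x):
    # coeffs = (a1, a2, ..., a_{n+1}); cost = a1 + a2*x1 + ... + a_{n+1}*xn
    return coeffs[0] + sum(a * xi for a, xi in zip(coeffs[1:], x))


def optimize(plans, x):
    """Return the set of plans that are (equally) optimal at the point x."""
    best = min(affine_cost(c, x) for c in plans.values())
    return {name for name, c in plans.items() if affine_cost(c, x) == best}


# Hypothetical two-parameter plans: name -> (a1, a2, a3).
PLANS = {"p": (2, 4, 1), "q": (3, 1, 3), "r": (4, 2, 2)}

print(optimize(PLANS, (0, 0)))  # only the constant terms matter here
print(optimize(PLANS, (1, 1)))  # two plans tie at this point
```

Returning a set rather than a single plan mirrors condition 3, which allows several plans to be equally optimal at the same point.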

We now discuss some simple and useful properties of affine PQO problems. 

1.5.2 Properties of affine PQO problems 

In this section, we state some properties of affine PQO problems. These properties 
have been quoted from [SGPrUmAn2001]. 
• Property 1. Convexity of Regions of Optimality for Affine PQO.

This property states that the region of optimality of a plan p, R(p), is a convex
polyhedron. The convexity theorem is given in [Ganguly98].

• Property 2. An n-parameter affine cost function restricted to a k-dimensional
hyperplane in F is affine in k parameters.


1.5.3 Examples: Affine problems 

Here we discuss some examples of query optimization problems which are
affine in nature. More information about these is in [SGPrUmAn2001].

• Single unknown selectivity

Let QR be a query having a single predicate whose selectivity is unknown.
Assuming uniform distribution of data and using the cost models of [SAC+79],
the parameterized cost equation of QR can be expressed as
$C(p, s) = a_1(p) + a_2(p) \cdot s$. Let s be normalized such that $0 \le s \le 1$. The cost
is linearly dependent on s and the problem can be expressed as an affine PQO problem.

QR: Select *
From STUDENT
Where HALL = 4

• Load Balancing in Distributed Databases

In a distributed query processing environment having n processors or sites, let
$a_i(p)$ denote the resource consumption by plan p at site or processor i. With
each site (or processor) a load factor $l_i$ is associated. The effective resource
consumption can be modeled as

$C(p, l_1, l_2, \ldots, l_n) = a_1 \cdot l_1 + a_2 \cdot l_2 + \cdots + a_n \cdot l_n, \quad 1 \le l_i \le \infty, \quad 1 \le i \le n.$

This can be expressed as a union of n affine PQO problems.
• Optimal Affine Approximation for Parallel and Distributed Databases

In a distributed query processing environment, let $a_i(p)$ denote the resource
consumption by the plan p at site i or processor i, as the case may be. The
cost metric is:

$B(p) = \max(a_0(p), a_1(p), \ldots, a_n(p))$

Such metrics cannot be optimized by applying dynamic programming approaches
and are difficult to optimize.


1.6 Motivation 

Current computing trends have made it necessary to employ techniques that
considerably reduce the response time of query execution while providing fault tolerance.
Parallel and distributed databases are gradually establishing themselves as the best
alternative for data-intensive and real-time applications. In parallel and distributed
databases, the cost models applicable to query execution plans are highly complex.
They are not parameterized and do not satisfy the principle of optimality required for
applying dynamic programming methodologies. A promising approach is to extend
parametric query optimization techniques to optimize such complex query metrics.

1.7 Our Contribution 

In this thesis we design algorithms to solve the problem of optimizing complex
query metrics using PQO techniques. The overall approach is a geometric one. The
query cost metric used is:

$B(p) = \max(a_0(p), a_1(p), \ldots, a_n(p))$.

We have also extended our algorithm to the n-parameter query metric case.
1.8 Organization of the Thesis


Chapter 2 gives a detailed explanation of the algorithm we designed for the
three-parameter metric. First the properties of the cost coordinate space and the
parameter space are discussed.

Chapter 3 extends the algorithm to the n-parameter complex metric.

Chapter 4 presents the conclusions we arrived at and looks at future research
work.
Chapter 2

Algorithm for 3 parameters 


This chapter discusses the basic idea of how to solve the optimization problem for
complex query metrics. Optimization of queries in distributed and parallel
environments yields complex non-parametric cost metrics. These complex query metrics
are computationally intensive and cannot be optimized using dynamic programming
approaches. One approach is to use Parametric Query Optimization (PQO), which
gives us the best parametric approximations. We discuss our algorithm
for an example complex query metric.


2.1 Example cost metric 

In a distributed query processing environment, let $a_i(p)$ denote the resource
consumption by the plan p at site i or processor i, as the case may be. The cost
metric is:

$B(p) = \max(a_0(p), a_1(p), \ldots, a_n(p))$

The above cost metric is difficult to optimize since it is not parametric and
solution techniques like dynamic programming are inadequate. One approach
is to apply PQO techniques by framing an appropriate parametric query
optimization problem, which gives us the best approximation.
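The text does not spell out why dynamic programming is inadequate for the max metric. One way to see it, under the assumption (ours, for illustration) that the per-site consumptions of consecutive sub-plans add up, is that the sub-plan with the lower B value need not lead to the complete plan with the lower B value. The resource vectors below are hypothetical.

```python
def combine(u, v):
    # Assumption: per-site resource consumption of consecutive steps adds up.
    return tuple(a + b for a, b in zip(u, v))


def b_metric(v):
    # B(p) = max over sites of the resource consumption.
    return max(v)


# Two alternative sub-plans and a common remaining step, on two sites.
sub1 = (4, 0)
sub2 = (0, 5)
rest = (4, 0)

# A dynamic programming step would keep only the locally better sub-plan.
local_best = min([sub1, sub2], key=b_metric)

# The best complete plan is built from the locally worse sub-plan.
full1 = combine(sub1, rest)
full2 = combine(sub2, rest)
global_best = min([full1, full2], key=b_metric)

print(local_best, b_metric(full1), b_metric(full2), global_best)
```

Pruning sub2 because B(sub2) > B(sub1) would discard the only path to the better complete plan, which is the failure of the principle of optimality in this toy setting.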
2.2 Overview of Approach


The PQO approach for the best approximation of conventional non-parametric query
optimization problems has the following two logical steps:

• Formulating a problem as a best affine approximation 

• Computing the best affine approximation 

2.2.1 Formulating a problem as a best affine approximation 

Consider the query metric for the distributed query processing environment. The
cost metric is defined by $B(p) = \max_{i=0}^{n} a_i(p)$, where $a_i(p)$ denotes the resource
consumption at some site i. We first formulate the PQO problem, represented by
PQODD.

PQODD: $C(p, x) = \sum_{i=0}^{n} a_i(p) x_i, \quad 1 \le x_i \le \infty, \quad 0 \le i \le n.$

Here $x_0, x_1, \ldots, x_n$ are artificial parameters. The PQODD problem is not affine,
so we construct n+1 affine PQO problems out of it as follows.

PQODD$^{(i)}$: $C^{(i)}(p, s) = a_i(p) + \sum_{j \ne i} a_j(p) \cdot s_j, \quad s_j = x_j / x_i, \quad 0 \le s_j \le 1$

Now find the parametric optimal plan in the parameter space that has the
minimum value of the B(p) metric. This is the best affine approximation of the plan
that is optimal with respect to the metric B. The best approximation can be efficiently
computed without generating the Parametric Optimal Set (POS) of plans.
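The relation between PQODD and the affine problems can be checked numerically. The sketch below reads the substitution as dividing the PQODD cost by the largest parameter x_i, so that every ratio s_j = x_j / x_i falls in [0, 1]; choosing i as the index of the largest x is our interpretation, and the coefficient and parameter values are hypothetical.

```python
from fractions import Fraction


def pqodd_cost(a, x):
    # C(p, x) = sum_i a_i(p) * x_i
    return sum(ai * xi for ai, xi in zip(a, x))


def affine_cost(a, i, s):
    # C^(i)(p, s) = a_i(p) + sum_{j != i} a_j(p) * s_j  (s[i] is ignored)
    return a[i] + sum(a[j] * s[j] for j in range(len(a)) if j != i)


def to_affine(x):
    """Pick i with the largest x_i and return (i, s) with s_j = x_j / x_i."""
    i = max(range(len(x)), key=lambda j: x[j])
    s = [Fraction(xj, x[i]) for xj in x]
    return i, s


a = (3, 5, 2)   # hypothetical (a_0(p), a_1(p), a_2(p))
x = (2, 8, 4)   # hypothetical artificial parameters

i, s = to_affine(x)
print(i, s)
print(pqodd_cost(a, x), x[i] * affine_cost(a, i, s))  # the two values agree
```

Since x_i is a positive factor common to all plans, ranking plans by C at x gives the same order as ranking them by C^(i) at s.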

Consider the case when $B(p) = \max(a_0(p), a_1(p))$. The following are the two
affine PQO problems:

PQODD$^{(0)}$: $C^{(0)}(p, x) = a_0(p) + a_1(p) \cdot x, \quad 0 \le x \le 1$

PQODD$^{(1)}$: $C^{(1)}(p, x) = a_1(p) + a_0(p) \cdot x, \quad 0 \le x \le 1$

In the next section we discuss the algorithm for the best affine approximation, which
was designed by Sumit Ganguly [SGPrUmAn2001].
2.2.2 Computing the best affine approximation

We have to find the parametric optimal plan having the minimum value of the B metric.
The EQUILINE refers to the line $a_0(p) = a_1(p) = \ldots = a_n(p)$ in the n dimensional
cost coordinate space. The basic idea is to first check whether the EQUILINE
cuts the convex hull CH or not. We optimize at x=0 and x=1. In the conditions and
the algorithm below, the two components of the metric are written $a_1$ and $a_2$.

If ($a_1(p) > a_2(p)$ | p = optimize(0)) or ($a_1(q) < a_2(q)$ | q = optimize(1)),
then the EQUILINE does not pass through the convex hull; otherwise the EQUILINE
passes through the convex hull.

In the former case, when the EQUILINE does not pass through the convex hull,
the required plan is at x=0 or x=1. This is shown in Figure 2.1.
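The endpoint test can be sketched as follows for a two-component metric with cost C(p, x) = a1(p) + a2(p) x. The optimize function is a brute-force stand-in over an explicit list of hypothetical plans, with ties broken by the smaller B value; both the plans and the tie-breaking rule are assumptions made for illustration.

```python
def cost(plan, x):
    a1, a2 = plan
    return a1 + a2 * x


def optimize(plans, x):
    # Brute-force stand-in for optimize(x); ties broken by smaller B = max(a1, a2).
    return min(plans, key=lambda p: (cost(p, x), max(p)))


def equiline_cuts_hull(plans):
    """Apply the endpoint test: optimize at x = 0 and x = 1."""
    p = optimize(plans, 0)
    q = optimize(plans, 1)
    return not (p[0] > p[1] or q[0] < q[1])


def best_when_not_cut(plans):
    # If the EQUILINE misses the hull, the required plan is at x = 0 or x = 1.
    return min((optimize(plans, 0), optimize(plans, 1)), key=max)


crossing = [(1, 10), (3, 5), (6, 1)]   # a1 < a2 at x = 0 and a1 > a2 at x = 1
one_sided = [(5, 2), (6, 1)]           # a1 > a2 for every plan

print(equiline_cuts_hull(crossing))    # True
print(equiline_cuts_hull(one_sided))   # False
print(best_when_not_cut(one_sided))    # (5, 2)
```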

In the latter case we have to descend the hull and move to its base. This is
done recursively using the algorithm below.

procedure Initialize()

Output: An interval [l, u] and plans p, q such that p = MinB(optimize(l)) and
q = MinB(optimize(u)).

x := random(0,1); p := MinB(optimize(x))

if (a1(p) = a2(p)) return [x,x,p,p]

if (a1(p) < a2(p)) return [x,1,p,MinB(optimize(1))]

if (a1(p) > a2(p)) return [0,x,MinB(optimize(0)),p]

procedure AffineApprox(l,u,p,q)

Input: 0 <= l <= u <= 1 such that p is optimal at l and q is optimal at u.

Output: The best affine approx w.r.t. the B metric in the interval [l,u].

if (l = u) return MinB(p,q).

Let [l,x] = ClipRegion(p, [l,u], q)

r = MinB(optimize(x)); if r is either p or q return r.

if (a1(r) > a2(r)) return AffineApprox(l,x,p,r)

if (a1(r) < a2(r)) return AffineApprox(x,u,r,q)

if (a1(r) = a2(r)) return r.
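A minimal runnable sketch of the two procedures is given below. Several details are assumptions on our part: optimize is a brute-force scan over a hypothetical list of plans, ClipRegion is taken to return the isocost point of p and q (the text does not define it), the random starting point is replaced by the fixed value 1/2 for reproducibility, and a guard is added for plans with equal slope.

```python
from fractions import Fraction

# Hypothetical plans as (a1, a2); cost C(p, x) = a1 + a2 * x, B(p) = max(a1, a2).
PLANS = [(1, 10), (3, 5), (6, 1)]


def cost(p, x):
    return p[0] + p[1] * x


def optimize(x):
    # Brute-force stand-in: all plans with minimum cost at x.
    best = min(cost(p, x) for p in PLANS)
    return [p for p in PLANS if cost(p, x) == best]


def min_b(plans):
    return min(plans, key=max)


def clip_region(p, q):
    # Assumed meaning of ClipRegion: the isocost point of p and q.
    return Fraction(q[0] - p[0], p[1] - q[1])


def initialize(x=Fraction(1, 2)):
    p = min_b(optimize(x))
    if p[0] == p[1]:
        return x, x, p, p
    if p[0] < p[1]:
        return x, Fraction(1), p, min_b(optimize(Fraction(1)))
    return Fraction(0), x, min_b(optimize(Fraction(0))), p


def affine_approx(l, u, p, q):
    # Added guard: equal slopes mean there is no isocost point to clip at.
    if l == u or p[1] == q[1]:
        return min_b([p, q])
    x = clip_region(p, q)
    r = min_b(optimize(x))
    if r in (p, q):
        return r
    if r[0] > r[1]:
        return affine_approx(l, x, p, r)
    if r[0] < r[1]:
        return affine_approx(x, u, r, q)
    return r


best = affine_approx(*initialize())
print(best, max(best))
```

With these plans the search starts at x = 1/2, clips at the isocost point 3/4 of the two bracketing plans, and stops because no new plan is optimal there.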


Figure 2.1: Algorithm for 2 parameter metric.

2.3 Properties of parameter space 

A database SQL query, given in a non-procedural, set-oriented representation, can be
executed using different plans. Each plan is expressed by a query graph. The cost
equation corresponding to each feasible plan for a given query can be represented by
a point in the cost coordinate space. Hence a parameterized plan p of n parameters
corresponds to a point A(p) in the n+1 dimensional cost coordinate space.

Cost coordinate space: CCS = $\{(a_0(p), a_1(p), \ldots, a_n(p)) \mid p \text{ is a feasible plan}\}$.

Convex hull: The convex hull of a set of points is the boundary formed by
points of the set such that all points in the set lie either inside or on the boundary
and the segment joining any pair of points lies completely inside the polygon.

2.3.1 Convex hull 

It has been shown by Sumit Ganguly in [Ganguly98] that the convex hull of the
set of points in the cost coordinate space corresponding to the set of feasible plans
defines the parametric optimal set of those plans. That is, the plans lying on
the convex hull form the parametric optimal set (POS) of plans.

The complete convex hull represents the parametric optimal set of plans when
the parameters are assumed to be normalized ($0 \le |x| \le 1$). When the parameter
value bounds are $0 \le x \le 1$, the parametric optimal set corresponds to only one half
of the complete convex hull.

Two plans p and q lying on the convex hull which are adjacent to each other are
connected by the line having direction ratios A(p) - A(q). This line corresponds
to the isocost plane of p and q in the parameter space, with R(p) and R(q) lying on
either side of it.
Thus two plans adjacent on the convex hull are neighbors in the parameter space.
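The correspondence between a hull edge and an isocost point can be checked directly in two dimensions. With cost C(p, x) = a0(p) + a1(p) x, the plans p and q cost the same exactly when the vector (1, x) is perpendicular to A(p) - A(q). The two plans below are hypothetical.

```python
from fractions import Fraction


def isocost_point(p, q):
    # Solve a0(p) + a1(p) * x == a0(q) + a1(q) * x for x.
    return Fraction(q[0] - p[0], p[1] - q[1])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


p, q = (1, 10), (3, 5)                # hypothetical points A(p) and A(q)
x = isocost_point(p, q)
edge = (p[0] - q[0], p[1] - q[1])     # A(p) - A(q)
normal = (1, x)                       # cost is the dot product of (1, x) with A(.)

print(x)                              # 2/5
print(dot(normal, edge))              # 0: the isocost direction is normal to the edge
print(dot(normal, (1, 1)))            # 7/5: positive whenever x >= 0
```

The last line is the two-dimensional version of the observation used later in this section: for non-negative parameter values the edge normal cannot make an obtuse angle with the EQUILINE direction (1, 1).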

All the plans which are adjacent to a plan p on the hull are neighbors of p in the
parameter space, and bound the region of optimality R(p) in all directions,
as shown in Figure 2.2.


Figure 2.2: Neighborhood in parameter space and Adjacency on hull. (The figure
shows the one-parameter space $F_s$, $0 \le s \le 1$, divided at the points a and b into
the adjacent regions R(r), R(p) and R(q).)

The region of optimality R(p) of a plan p is represented in the cost coordinate
space by the range of the direction cosines of the normal to any line l that can
be drawn passing through A(p) without intersecting any of the adjacent plans
represented by the points A(q). If the parameter space is $0 \le x \le 1$, then the region
of optimality R(p) includes only non-negative parameter values, hence the direction
cosines of the normal of l cannot be negative. That means the normal vector of every
face of the convex hull has positive direction cosines.

The convex hull corresponding to the positive parameter space is that half of the
complete convex hull that lies below the plane
$l \cdot x + m \cdot y + n \cdot z + \cdots + k \cdot w = 1$, towards the origin,
where $l, m, \ldots, k > 0$.

The EQUILINE intersects the complete convex hull at two distinct points due
to the convexity of the complete convex hull.

The EQUILINE intersects the convex hull corresponding to the positive parameter
space either at one point or not at all.

The EQUILINE cannot intersect this convex hull at two points; that is, the
EQUILINE cannot pierce through our hull.



Figure 2.3: Intersection of EQUILINE with hull


Proof: Suppose the EQUILINE pierces the hull, and let $f_1$ and $f_2$ be the two faces
of the hull which it pierces. This is illustrated in Figure 2.3. Then the EQUILINE
must make an acute angle with the inward normal vector of one face and an obtuse
angle with the inward normal vector of the other face.

Let $n_1$ and $n_2$ be the normal vectors of the two faces, and let $d_{eq}$ be the direction
cosines of the EQUILINE. For a face normal n:

• Angle is obtuse implies $n \cdot d_{eq} < 0$.

• Angle is acute implies $n \cdot d_{eq} > 0$.

$n \cdot d_{eq} \propto n_x + n_y + \ldots + n_w$

Since the direction cosines of the normal to a face of the hull are positive, $n \cdot d_{eq}$
can never be less than zero. Hence the angle between the normal of a face and the
EQUILINE cannot be obtuse. That is, the EQUILINE cannot intersect the convex hull
at two points.



Figure 2.4: Boundary of hull and parameter space. 


2.3.2 Boundary Theorem 

The plans lying on the boundary $B_{ch}$ of the hull represent the parametric optimal
set of plans on the boundary $B_p$ of the parameter space F.

Proof: The plans lying on the lower part of the convex hull are adjacently
bounded by neighboring plans in all directions. That is, in the parameter space
we cannot boundlessly increase the parameter value along some direction and still
remain inside R(p). The plans lying on the boundary $B_{ch}$ of the convex hull, in
contrast, are not bounded by neighbors in all directions. So we can increase the
parameter value boundlessly along any such direction d from s, and (s+d) still belongs
to R(p). Thus the plans lying on the boundary $B_{ch}$ of the hull represent the parametric
optimal set of plans on the boundary $B_p$ of the parameter space F. This is depicted
in Figure 2.4.

2.3.3 Base corollary 

If the EQUILINE does not intersect the convex hull, the base of the hull must also
be a part of the boundary of the hull, and hence must lie on the boundary of the
parameter space.

Let p be any plan on the convex hull in the $(a_0(p), a_1(p))$ space near the EQUILINE
such that $a_0(p) > a_1(p)$ and $|a_0(p) - a_1(p)| = \delta$ (a very small quantity). Any plan q
that is a neighbor of p cannot lie in the rectangular region OAPB or in the infinite
region PB'IA' in the cost coordinate space, where O = (0,0), A = $(a_0(p), 0)$, P =
$(a_0(p), a_1(p))$, B = $(0, a_1(p))$, B' = $(\infty, a_1(p))$, A' = $(a_0(p), \infty)$ and I = $(\infty, \infty)$.
This is shown in Figure 2.5.

Let $E_1$ and $E_2$ be the two adjacent edges of the convex hull on which plan p lies.
Neither edge can intersect the rectangle OAPB, because of the convexity of the hull.
If the EQUILINE does not cut the convex hull, then one of the edges must lie in the
region PB'IA'. But this is not possible, because the normal to any edge of the convex
hull cannot make an obtuse angle with the EQUILINE. Hence such an edge cannot exist.
Thus such a plan p cannot have adjacent neighbors in all directions if the hull is not
cut by the EQUILINE. The same applies when $a_0(p) < a_1(p)$. Thus plan p lies on the
boundary of the convex hull, and hence is optimal along the boundary of the parameter
space.

The above argument can be applied to the higher dimensional case too. Let p be any
plan on the convex hull in the $(a_0(p), a_1(p), a_2(p))$ cost coordinate space near the
EQUILINE such that $a_0(p) = \max_i(a_i(p))$. Any plan q that is a neighbor of p must
not lie in the 3 dimensional cube having the diagonal OP, where O is the origin and
P = A(p). The normal to any face $f_i$ on which plan p lies must not make an obtuse
angle with the EQUILINE. That is, no neighbor plan can lie in the 3 dimensional cube
in cost coordinate space having AI as its diagonal, where A = A(p) and
I = $(\infty, \infty, \infty)$.

Figure 2.5: Base of Hull lies on the boundary of the hull.

Thus q should either approach the EQUILINE and cross it, or go away from it. If
the EQUILINE does not intersect the convex hull, then q, the neighbor of p, must go
away from the EQUILINE. Thus p has no neighbor along such a plane, and so p lies on
the boundary of the convex hull.

Hence if the EQUILINE does not intersect the convex hull, the base of the hull
must lie on the boundary of the hull, and hence on the boundary of the parameter space.
2.3.4 Neighbor Corollary 

Consider the convex hull of parametric optimal plans in the cost coordinate space
$(a_0(p), a_1(p), a_2(p))$. The parameter space is $F_{x,y}$, $0 \le x, y \le 1$. Any plan p which
is a neighbor of three plans $p_1, p_2, p_3$ must have its region of optimality embedded
between $R(p_1)$, $R(p_2)$ and $R(p_3)$ in the parameter space. The plan p which is optimal at
the isocost point $s_{iso}$ must be a neighbor of the three plans. The plan p which is a
neighbor of the three plans must lie behind the plane formed by $p_1$, $p_2$ and $p_3$ on the
convex hull in the cost coordinate space.

Thus any plan p optimal inside the polyhedron formed by the points $s_1, s_2, \ldots, s_n$
must lie behind the plane formed by the plans optimal at these points in the n
dimensional cost coordinate space.


2.4 Approach for 3 parameter metric 

In this section we discuss the approach for applying PQO when the query cost
metric is of the form:

$B(p) = \max(a_0(p), a_1(p), a_2(p))$

PQODD: $C(p, x) = a_0(p) \cdot x_0 + a_1(p) \cdot x_1 + a_2(p) \cdot x_2, \quad 0 \le x_0, x_1, x_2 \le \infty$

The above PQODD can be expressed as the 3 affine PQO problems given below:

PQODD$^{(0)}$: $C^{(0)}(p, s) = a_0(p) + a_1(p) \cdot s_1 + a_2(p) \cdot s_2, \quad 0 \le s_1, s_2 \le 1$

PQODD$^{(1)}$: $C^{(1)}(p, s) = a_1(p) + a_0(p) \cdot s_1 + a_2(p) \cdot s_2, \quad 0 \le s_1, s_2 \le 1$

PQODD$^{(2)}$: $C^{(2)}(p, s) = a_2(p) + a_0(p) \cdot s_1 + a_1(p) \cdot s_2, \quad 0 \le s_1, s_2 \le 1$

The best affine approximation to PQODD is the best of the best affine approximations
to PQODD$^{(0)}$, PQODD$^{(1)}$ and PQODD$^{(2)}$. We shall explain the
general approach for finding the best affine approximation of one such problem
PQODD$^{(i)}$.
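To make the decomposition concrete, the sketch below evaluates the three affine problems on a coarse grid of (s1, s2) values, collects every plan that is optimal at some grid point, and returns the collected plan with the smallest B value. Grid sampling is only a brute-force stand-in used for illustration; it is not the geometric algorithm developed in this chapter and it can miss plans whose region of optimality contains no grid point. The four plans are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical plans as (a0, a1, a2).
PLANS = [(2, 9, 9), (5, 5, 6), (9, 2, 3), (7, 7, 1)]


def affine_cost(a, i, s1, s2):
    # C^(i)(p, s): a_i is the constant term, the other two coefficients
    # (in increasing index order) multiply s1 and s2.
    others = [a[j] for j in range(3) if j != i]
    return a[i] + others[0] * s1 + others[1] * s2


def best_affine_approximation(plans, steps=10):
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    candidates = set()
    for i in range(3):
        for s1, s2 in product(grid, grid):
            best = min(affine_cost(a, i, s1, s2) for a in plans)
            candidates.update(
                a for a in plans if affine_cost(a, i, s1, s2) == best
            )
    # Among the (sampled) parametric optimal plans, pick the minimum B = max.
    return min(candidates, key=max)


print(best_affine_approximation(PLANS))
```

In this example the plan (5, 5, 6) is optimal for the first affine problem at (s1, s2) = (4/5, 2/5) and has the smallest B value, so it is returned.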

In this case the convex hull is a surface in the 3 dimensional cost coordinate space.
The required plan lies on the base of this convex hull. The basic idea is to find 3 plans
on the convex hull such that the plane formed by these three plans is intersected
by the EQUILINE. The neighbor corollary says that any plan that is a neighbor of three
plans lies behind the plane formed by these three plans in the cost coordinate space.
This neighbor plan also has a lower B(p) than at least one of the three plans, since
the normal to any face of even the partial hull cannot form an obtuse angle with the
EQUILINE.

The region of optimality of this plan lies embedded within the regions of optimality
of these three plans in the parameter space $F_{x,y}$, so the isocost point of these three
plans in the parameter space is a good choice to constrict the search space in parameter
space. Hence once we find such a triple of plans we can descend to the base of the
convex hull, using the isocost point of these three plans to constrict the search region
in parameter space.

The algorithm first checks if such a triple of plans can be found among the plans
optimal at the corners of the parameter space. If such a triple cannot be found
among the corner plans, we search the edges of the parameter space for a plan with
which a suitable triple can be constructed. This search of edges is made more
efficient by searching only the prospective edge.

The three plans and their neighbor p form a tetrahedron (a
convex polyhedron) in the 3 dimensional cost coordinate space. The EQUILINE will
either intersect the tetrahedron in 2 points (i.e., 2 faces) or will not intersect it at all.
Thus descending the hull always yields a plane intersected by the EQUILINE. The
algorithm recursively descends until no new plan is found. There are boundary
cases, which can be handled appropriately.

The selection of the prospective edge is done when the four plans optimal at the four
corners cannot form a plane through which the EQUILINE passes. The prospective
edge gives us the parameter space over which the partial convex hull should be
developed so that the EQUILINE passes through it. This is found using the intersection
of the EQUILINE with the plane of the three corner plans.
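The geometric test used here, whether the EQUILINE meets the plane of three plans and whether the meeting point falls inside the triangle they span, can be sketched with exact arithmetic as below. The function names and the sample points are hypothetical, and reading the triangle test as what the algorithm calls "inside" is our assumption.

```python
from fractions import Fraction


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def equiline_hits_plane(p1, p2, p3):
    """Return t such that (t, t, t) lies on the plane through p1, p2, p3,
    or None if the EQUILINE is parallel to that plane."""
    n = cross(sub(p2, p1), sub(p3, p1))
    denom = dot(n, (1, 1, 1))
    if denom == 0:
        return None
    return Fraction(dot(n, p1), denom)


def inside(p1, p2, p3):
    """True if the EQUILINE passes through the triangle p1 p2 p3."""
    t = equiline_hits_plane(p1, p2, p3)
    if t is None:
        return False
    point = (t, t, t)
    n = cross(sub(p2, p1), sub(p3, p1))
    # The point is inside iff it lies on the inner side of all three edges.
    for a, b in ((p1, p2), (p2, p3), (p3, p1)):
        if dot(cross(sub(b, a), sub(point, a)), n) < 0:
            return False
    return True


print(equiline_hits_plane((2, 9, 9), (9, 2, 3), (7, 7, 1)))   # 73/13
print(inside((2, 9, 9), (9, 2, 3), (7, 7, 1)))                # True
print(inside((1, 9, 9), (2, 8, 9), (1, 8, 10)))               # False
```

When the test fails, the side of the triangle on which the intersection point falls indicates in which direction the search should continue.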

The basic idea is that if the EQUILINE cuts the convex hull then we have to
descend to the base of the hull; otherwise the base is also part of the boundary of the
hull. The minimum B(p) plan lies on the base of the hull.

2.5 Detailed description of Algorithm 

This section gives the detailed description of the overall strategy and explains each 
of the algorithms used. 

2.5.1 Algorithm Main 

1. Optimize at the four corners {O,A,B,C}
   /* pmset = {p1,p2,p3,p4}, Pdistinct ⊆ pmset. */

2. if (|Pdistinct| == 1)
   { return the unique plan. }
   else
   { if (|Pdistinct| < 3)
     { edge = getprospective(p,q,r);
       Pdistinct = Pdistinct ∪ getsomeplan(edge);
       /* make the number of distinct plans 3 by searching on the prospective edge. */
     }
   }
   /* To start with, at least 3 distinct plans are needed to frame a plane */

3. if (inside(Pdistinct))
   { Pdistinct = do_boundary_checks();
     /* Let f1,f2 be the two faces through which the EQUILINE passes. Select the fi
        which has the plan having minimum B(p) */
     Pdistinct = face having B^-1(minimum(B(p1),B(p2),B(p3),B(p4)))
     goto step 4
   }
   else
   { /* Intersecting plane cannot be framed from corner plans */
     temp = getintplan(Pdistinct);
     /* getintplan will search for a plan (to frame such a plane) on the prospective
        edge. */
     if (inside(temp,p,q,r[,t]) == False) return search_on_boundary(temp);
     /* inside(temp,p,q,r[,t]) = False => The EQUILINE does not intersect the
        convex hull at all */
   }

4. return Descend_Hull

■ Algorithm do_boundary_checks

This function does the checking for boundary cases. It checks for the following case:

• Check if {INT1,INT2} ∩ {p1,p2,p3,p4} ≠ ∅. Return INT if one of the vertices
lies on the EQUILINE.

• All other condition checks are taken care of in the Descend_Hull algorithm.

For case 1 it returns INT1 or INT2.

■ Description Main

In step 1 we optimize at the four corners of the parameter space. Let pmset be the
multiset of plans p1, p2, p3 and p4 optimal at the corners, and let Pdistinct be the set
of distinct plans among these. Since a plane in the three dimensional cost coordinate
space is uniquely defined by three distinct plans, we check the cardinality of the
set Pdistinct.

If the cardinality of Pdistinct is one, that is, only a single plan is optimal at the
four corners, then by the convexity of the regions of optimality that plan is optimal
over the whole parameter space. Uniquely defining a plane requires three plans, so if
the cardinality of Pdistinct is less than three we search for more plans on the
prospective edge. A prospective edge is a boundary edge of the parameter space on
which the prospects of finding a plan of interest are bright.
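Step 1 can be sketched as follows. This is an illustration only: the real optimize
function is a call to the query optimizer, whereas here it is replaced by a stand-in
that picks the cheapest plan from an explicit candidate list, and the corners are
assumed to be those of the unit square. Plans are tuples (a0, a1, a2) with assumed cost
a0 + a1·x + a2·y.

```python
from itertools import product


def cost(plan, point):
    """Assumed affine cost a0 + a1*s1 + ... + an*sn of a plan at a point."""
    a0, *rest = plan
    return a0 + sum(a * s for a, s in zip(rest, point))


def optimize(plans, point):
    """Stand-in for the optimizer: index of the cheapest candidate at point."""
    return min(range(len(plans)), key=lambda i: cost(plans[i], point))


def corner_plans(plans, n=2):
    """Return (pmset, pdistinct) for the corners of the unit n-cube."""
    corners = list(product((0, 1), repeat=n))
    pmset = [optimize(plans, c) for c in corners]   # multiset, one per corner
    pdistinct = sorted(set(pmset))                  # distinct plans among them
    return pmset, pdistinct
```

The cardinality of `pdistinct` then decides which branch of step 2 is taken: one plan
means the search is over, fewer than three means more plans must be found on the
prospective edge.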

The EQUILINE either intersects the tetrahedron at two points or does not intersect
it at all. The inside function returns true if the EQUILINE intersects Δpqr for
any three plans p,q,r. We also perform a boundary conditions check to deal with the
case where the EQUILINE touches the tetrahedron. If the EQUILINE ideally intersects
the tetrahedron in two faces, the face(s) which have the plan having minimum B(p),
that is B^-1(minimum(B(p1),B(p2),B(p3),B(p4))), is selected.

If the tetrahedron is not intersected by the EQUILINE, we search for such a possibility
by developing the partial hull along the prospective direction, in other words by
searching on the prospective edge of the parameter space. Procedure getintplan does
this task. If getintplan returns a plan which does not form an intersecting plane, it
implies that the whole convex hull is not intersected by the EQUILINE. Since the
boundary of the convex hull represents the parametric optimal plans along the boundary
of the parameter space, we search the boundary.

If we can construct a triple of distinct plans through which the EQUILINE passes,
we descend to the base of the convex hull using the Descend_Hull algorithm.

2.5.2 Search_on_boundary 

Input: Pbase

When getintplan returns unsuccessfully, it returns a plan Pbase in the base of the
convex hull. In this case the convex hull lies on one side of the EQUILINE, so the
boundary of the convex hull also includes the base; that is, the base of the convex
hull is also present on the boundary of the parameter space. So we search for the best
plan in terms of the B(p) metric in the neighborhood of Pbase on the appropriate
boundary edge.

2.5.3 Algorithm getprospective 

Input p,q,r
Output set of tuples {p1,p2}

result = ∅;
ss = nearest({p,q,r});
/* {p,q} ∈ ss */
for (each {p,q} ∈ ss)
{ if ({p,q} forms a real edge of OABC)
    result = result ∪ {p,q};
}
return result;

■ Description getprospective

This function returns the prospective edge defined by the plans optimal at the
corners. The definition of prospective edge is in Figure 2.6. It finds the nearest edge
to the EQUILINE and, if it corresponds to some real edge of the parameter space,
returns that edge. If the nearest edge is not a real edge then it returns null. An edge
is real if it corresponds to some existing boundary edge of the parameter space.

Figure 2.6: Defining Prospective Edge.

2.5.4 Algorithm getintplan 

Input edgeset, All_corners
Output Pint

prosedgeset = getprospective(edgeset)
for (each pros ∈ prosedgeset)
{
  temp.s = Isocost_edge_i(pros.p, pros.q);
  /* pros.p and pros.q are plans optimal at the corners of the edge i */
  temp.plan = optimize(temp.s);
  if (temp.plan ∈ {pros.p, pros.q})
  { remove(prosedgeset, pros)
    prosedgeset = prosedgeset ∪ edge(temp.plan, nplan)
    /* where nplan ∈ pros and nplan ≠ temp.plan */
    continue }
  if (inside(ss, temp.plan) | ss ⊂ All_corners and |ss| = 2)
  { return temp.plan }
  tempedgeset = the two sub edges partitioned at temp.s
  remove(prosedgeset, pros)
  prosedgeset = prosedgeset ∪ getprospective(tempedgeset)
}
return temp.plan

■ Description getintplan

The getintplan procedure searches for an interesting plan on a prospective edge of the
parameter space. Once it finds a prospective edge, it optimizes at the isocost point of
the two plans optimal at the end points of the edge. This is a non recursive procedure
and terminates only if no new plan is found and the prospective parameter space
is completely traversed. It adopts a divide and conquer strategy. Suppose p,q are
the optimal plans at the end points of the prospective edge, and piso is the
optimal plan at the isocost point of p and q. We then decide which half of the
prospective edge to conquer. If the convex hull lies to one side of the EQUILINE,
getintplan returns the plan which is at the base as it works its way down.
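The divide and conquer step on an edge can be illustrated with a simplified sketch.
Along one edge the cost of each plan is assumed to reduce to a line alpha + beta·t
in a single parameter t. The sketch collects every plan that is optimal somewhere on
the edge by repeatedly optimizing at the isocost point of the two end-point plans; it
does not include the EQUILINE test that lets getintplan stop early, and the optimizer
is again replaced by a stand-in over an explicit candidate list.

```python
from fractions import Fraction


def edge_cost(plan, t):
    """Assumed cost alpha + beta*t of a plan (alpha, beta) along the edge."""
    alpha, beta = plan
    return alpha + beta * t


def optimize(plans, t):
    """Stand-in for the optimizer: index of the cheapest candidate at t."""
    return min(range(len(plans)), key=lambda i: edge_cost(plans[i], t))


def plans_on_edge(plans, lo=Fraction(0), hi=Fraction(1)):
    """Indices of the plans found by isocost bisection on [lo, hi]."""
    p, q = optimize(plans, lo), optimize(plans, hi)
    if p == q:
        return {p}              # same plan at both ends: optimal in between
    (ap, bp), (aq, bq) = plans[p], plans[q]
    if bp == bq:
        return {p, q}           # guard: duplicate cost lines, nothing to split
    t = Fraction(aq - ap) / Fraction(bp - bq)   # isocost point of p and q
    r = optimize(plans, t)
    if edge_cost(plans[r], t) == edge_cost(plans[p], t):
        return {p, q}           # no new plan is cheaper at the isocost point
    # a new plan was discovered: conquer both halves of the edge
    return plans_on_edge(plans, lo, t) | plans_on_edge(plans, t, hi)
```

Exact rational arithmetic is used so that the test for a newly discovered plan at the
isocost point is not disturbed by rounding.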

■ Proof for getintplan algorithm

By traversing the prospective edge we can get a plan to frame a plane intersected by
the EQUILINE, which is of interest to us.

Proof. When the partial hull that has been developed is not intersected by the
EQUILINE, we need to develop it further between the two plans that are spatially
nearer to the EQUILINE. That is the prospective direction, and developing the hull
along it is favorable for framing a plane intersected by the EQUILINE. The prospective
edge is the edge of the parameter space boundary corresponding to such a set of plans,
which are optimal at the ends of the edge.

If the EQUILINE does not intersect the convex hull at all, that is, the hull
lies to one side of the EQUILINE, then we cannot frame such a plane at all. The
divide and conquer approach in that case descends to the base of the hull,
which also lies on the boundary of the hull, that is, on the prospective edge of the
boundary of the parameter space.

2.5.5 Algorithm Descend_Hull 

Input p,q,r,t,s1,s2,s3,s4;
Output plan;

if (t ≠ ∅)
{
  /* To select the three plans intersected by EQUILINE out of two intersecting faces
     formed by corners */
  temp.s = s4;
  temp.plan = t;
}
else
{
  temp.s = ISOCOST(p,q,r);
  temp.plan = optimize(temp.s);
}
while (not (temp.plan ∈ {p,q,r}))
{
  /* Loops till some new plan is discovered */
  {p,q,r} = process_descend(p,q,r,temp.plan,s1,s2,s3,temp.s);
  if (|{p,q,r}| == 0) return main(rectangle(s1,s2,s3));
  /* If the EQUILINE lies in the plane of a face or touches the tetrahedron, call main
     at that point, the new boundary of the hull */
  temp.s = ISOCOST(p,q,r);
  temp.plan = optimize(temp.s);
}
/* Return the plan having minimum value of B(p) at the base */
return best(p,q,r);

■ Algorithm process_descend

Input p, q, r, pisocost, s1, s2, s3, sisocost
Output {p,q,r}

{INT1,INT2} = intersection(p, q, r, pisocost, EQUILINE).
if (INT2 ∈ {p,q,r,pisocost}) return INT2;
/* If INTi is any vertex, return that INTi */
if (INT2 = ∅)
{ /* The EQUILINE touches the tetrahedron, generate set of new appropriate plans */
  return ∅;
}
if (INT1 ∈ Δpqr and INT2 ∈ Δpqr)
{ return ∅;
  /* The EQUILINE lies in the plane of a face of the tetrahedron */
}
if (INT1 ∈ face1 and INT2 ∈ face2)
{ return face2; }
else
{
  Δ1 : {p, q, optimize(isocost(p, pisocost, r))}
  Δ2 : {p, r, optimize(isocost(p, pisocost, q))}
  if (inside(Δ1)) return Δ1;
  if (inside(Δ2)) return Δ2;
  else return {p, q, isocost(p,q,r)}
  /* This case is handled subsequently, an EQUILINE touching the face is returned */
}

■ Description Descend_Hull

The Descend_Hull algorithm takes 3 distinct plans as input, such that the triangular
plane through them in the cost coordinate space is intersected by the EQUILINE. This is
depicted in Figure 2.7. Three points {s1,s2,s3} in the parameter space F(x,y) are also
taken, such that each si ∈ R(pi), pi ∈ {p,q,r}. Any plan pisocost that is optimal at
the isocost point of these three plans must lie on the convex hull behind the plane
formed by them in the cost coordinate space.

Figure 2.7: Descend Hull Algorithm.

If pisocost lies on the EQUILINE then pisocost is the required plan. Otherwise it is
guaranteed that B(pisocost) < max(B(p),B(q),B(r)) when the EQUILINE intersects Δpqr.

{p,q,r,pisocost} forms a tetrahedron (a convex polyhedron). The EQUILINE intersects
the tetrahedron at 2 points {INT1,INT2}. {INT1,INT2} ∩ {p,q,r} = ∅ is guaranteed.
Let INT1 lie on the plane of {p,q,r}; INT1 can lie either on an edge or on the
triangular face Δpqr.

1. INT1 lies inside the face Δpqr.

• INT2 ≠ ∅

If pisocost lies on the EQUILINE, then INT2 = pisocost and pisocost is the required
plan. Otherwise INT2 can lie on any edge or adjacent face.

If INT2 lies on a face, then we replace the plan B^-1(max(B(p),B(q),B(r)))
by pisocost; the resultant face is guaranteed to be intersected by the EQUILINE.

If INT2 lies on, say, the (p,pisocost) edge, find optimize(isocost(p,pisocost,q))
(or optimize(isocost(p,pisocost,r))) and do the above checks considering the
tetrahedron {p,optimize(isocost(p,pisocost,r)),q,r} or {p,optimize(isocost(p,pisocos…

2. INT1 lies on an edge of Δpqr.

• INT2 ≠ ∅

If pisocost lies on the EQUILINE, then INT2 = pisocost and pisocost is the required
plan. Otherwise INT2 can lie on any edge or adjacent face.

If INT2 lies on a face, then we replace the plan B^-1(max(B(p),B(q),B(r)))
by pisocost; the resultant face is guaranteed to be intersected by the EQUILINE.

If INT2 lies on, say, the (p,pisocost) edge, find optimize(isocost(p,pisocost,q))
(or optimize(isocost(p,pisocost,r))) and do the above checks considering the
tetrahedron {p,optimize(isocost(p,pisocost,r)),q,r} or {p,optimize(isocost(p,pisocos…

• INT2 = ∅

Consider the rectangular parameter space O1A1B1C1 formed by {s1,s2,s3}
in the parameter space,
return main(O1,A1,B1,C1)

3. Both INT1 and INT2 lie on the same triangular face Δpqr.

Consider the rectangular parameter space O1A1B1C1 formed by {s1,s2,s3} in
the parameter space,
return main(O1,A1,B1,C1)

■ Proof for Descend_Hull algorithm

The Descend_Hull algorithm tries to frame a plane in an n dimensional cost coordinate
space through which the EQUILINE passes, and tries to descend to the base of
the hull.

Proof. Suppose p, q and r are three plans in the 3 dimensional cost coordinate space
(a0(p),a1(p),a2(p)). Any plan pneigh that is a neighbor of the three plans must lie
behind the plane formed by p,q,r, towards the origin. Plans p,q,r,pneigh form a
tetrahedron (a convex polyhedron), which is the partial convex hull. Since none
of the faces can make an obtuse angle with the EQUILINE, pneigh should lie behind
the plane. In case pneigh lies above the plane formed by p,q,r, the other three
faces of the tetrahedron would make an obtuse angle.

Any plan that is a neighbor of the three plans must lie embedded within the
region of optimality of the three plans in the parameter space. Any straight line
that intersects a convex polyhedron must intersect it at exactly two points (faces) or
not intersect it at all, setting aside the case where the line only touches the
polyhedron (a boundary case with one point of intersection).

Thus the Descend_Hull algorithm safely descends to the base of the hull.
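The convexity argument can be made concrete with a small sketch that clips the
EQUILINE, written as the points (t,t,t), against the four half-spaces of a tetrahedron.
Because the tetrahedron is convex, the surviving values of t form one interval, so the
result is no point, one touching point, or two boundary points. The function name and
the tuple representation of plans in cost space are assumptions of this sketch; it is
a geometric illustration, not the intersection routine of the thesis.

```python
from fractions import Fraction
from itertools import combinations


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def equiline_hits(tetra):
    """Points where the line a0 = a1 = a2 meets the tetrahedron boundary."""
    lo, hi = float("-inf"), float("inf")
    for face in combinations(range(4), 3):
        (w,) = set(range(4)) - set(face)          # vertex opposite this face
        a, b, c = (tetra[i] for i in face)
        n = cross(sub(b, a), sub(c, a))
        side = dot(n, sub(tetra[w], a))
        if side == 0:
            raise ValueError("degenerate tetrahedron")
        if side < 0:
            n = tuple(-x for x in n)              # make n point to the inside
        slope, offset = sum(n), dot(n, a)         # inside means t*slope >= offset
        if slope == 0:
            if offset > 0:
                return []                         # line parallel and outside
            continue
        t = Fraction(offset) / Fraction(slope)
        if slope > 0:
            lo = max(lo, t)
        else:
            hi = min(hi, t)
    if lo > hi:
        return []
    if lo == hi:
        return [(lo, lo, lo)]
    return [(lo, lo, lo), (hi, hi, hi)]
```

For the tetrahedron with vertices (3,0,0), (0,3,0), (0,0,3) and the origin, the line
enters at the origin and leaves through the face a0 + a1 + a2 = 3 at (1,1,1).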

2.5.6 Algorithm nearest 

Input: p,q,r;
Output: result

PL = plane(p,q,r);
INT = intersection(EQUILINE,PL);
for (each ss ⊂ {p,q,r}, |ss| = 2)
{ dv = direction vector of line formed by ss
  if ((dprod = (dv·INT)·(dv·w)) > 0, w ∈ {p,q,r} and w ∉ ss)
    result = result ∪ ss
}
return result;

■ Algorithm inside

Input: p,q,r;
Output: True or False

result = nearest(p,q,r);
if (|result| == 3) return True; else return False;

■ Description nearest and inside 

The function nearest finds, of the three edges of the triangle formed by three plans,
the edge which is nearest to the EQUILINE. The function finds the intersection point
of the EQUILINE with the plane formed by the three plans, say INT. Let dv be
the direction cosines of edge A(q)A(r). If the product of the dot product of dv with
the coordinates of A(p) and the dot product of dv with INT is negative, then p and
INT lie on opposite sides, and this edge A(q)A(r) is the nearest edge.

The function inside finds whether INT is inside Δpqr. It calls nearest for each
two vertex subset of {p,q,r}; if each vertex and INT lie on the same side with
respect to the other two vertices (the opposite edge), then the cardinality, or number
of tuples, returned by nearest would be three. Hence INT lies inside Δpqr.
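A hedged sketch of the two geometric steps follows: intersecting the EQUILINE with the
plane of three plans, and testing whether that point lies inside the triangle. Note
that the side test below uses cross products within the plane, a standard same-side
test swapped in for illustration, rather than the dot-product formulation of the
pseudocode above. Function names and the tuple representation are assumptions.

```python
from fractions import Fraction


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def equiline_plane_point(p, q, r):
    """Point INT where the line a0 = a1 = a2 meets plane(p, q, r), or None."""
    n = cross(sub(q, p), sub(r, p))
    denom = sum(n)                      # n . (1, 1, 1)
    if denom == 0:
        return None                     # EQUILINE is parallel to the plane
    t = Fraction(dot(n, p)) / Fraction(denom)
    return (t, t, t)


def inside_triangle(p, q, r):
    """True if INT lies inside (or on the border of) the triangle pqr."""
    x = equiline_plane_point(p, q, r)
    if x is None:
        return False
    n = cross(sub(q, p), sub(r, p))
    for a, b, w in ((q, r, p), (r, p, q), (p, q, r)):
        edge = sub(b, a)
        s_x = dot(cross(edge, sub(x, a)), n)   # side of INT w.r.t. edge ab
        s_w = dot(cross(edge, sub(w, a)), n)   # side of the opposite vertex
        if s_x * s_w < 0:
            return False                # INT and w on opposite sides of ab
    return True
```

An edge for which the two signs differ is the one separating INT from the triangle,
which corresponds to the notion of the nearest edge used above.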

2.5.7 Analysis of Boundary conditions 

Consider the case where three distinct plans, together with their neighbor, form a
tetrahedron in the cost coordinate space. Depending on how the EQUILINE meets the
tetrahedron, the following cases arise:

1. The EQUILINE does not intersect or touch the tetrahedron.

Find the prospective edge and run getintplan on it.

2. The EQUILINE touches the tetrahedron.

Here there is only one point of intersection INT1, and INT2 = ∅. INT1 can be a
vertex or an edge. If INT1 is a vertex then return INT1, otherwise

3. The EQUILINE intersects the tetrahedron at 2 points.

Here the following are the possibilities:

(a) INT1 is vertex

i. INT2 is vertex: return INT1

ii. INT2 is face: return INT1

iii. INT2 is edge: return INT1

(b) INT1 is face

i. INT2 is vertex: return INT2

ii. INT2 is face: return the face containing the plan B^-1(min(B(p),B(q),B(r),B(t)))

iii. INT2 is edge:

search for a new plan to replace t (or optimize(iso(p,q,r))).
The new plan can be tnew = optimize(iso(p,q,t)) or tnnew = optimize(iso(p,r,t));
check whether an intersecting plane can be constructed using
plane(p,q,tnnew) or plane(p,r,tnew).

(c) INT1 is edge

i. INT2 is vertex: return INT2

ii. INT2 is face:

search for a new plan to replace t (or optimize(iso(p,q,r))).
The new plan can be tnew = optimize(iso(p,q,t)) or tnnew = optimize(iso(p,r,t));
check whether an intersecting plane can be constructed using
plane(p,q,tnnew) or plane(p,r,tnew).

iii. INT2 is coplanar edge: both INT1 and INT2 lie in the same plane,
i.e., Δpqr.

Consider the rectangular parameter space O1A1B1C1 formed by {s1,s2,s3}
in the parameter space,
return main(O1,A1,B1,C1)

iv. INT2 is non coplanar edge:

search for a new plan to replace t (or optimize(iso(p,q,r))).
The new plan can be tnew = optimize(iso(p,q,t)) or tnnew = optimize(iso(p,r,t));
check whether an intersecting plane can be constructed using
plane(p,q,tnnew) or plane(p,r,tnew).

2.5.8 Miscellaneous Algorithms

The following miscellaneous functions have also been used.

1. optimize finds the plan(s) optimal at any point s in the parameter space.

2. Isocost finds the isocost point sisocost of a set of plans.

3. rectangle returns the rectangular parameter space formed by 3 points s1,s2,s3
in the parameter space.

4. intersection finds the intersection of the EQUILINE with the plane formed by 3
plans in cost coordinate space, plane(A(p),A(q),A(r)).

5. getsomeplan finds out some plan on the given edge of parameter space.

Chapter 3

Algorithm for n parameters 


This chapter discusses the techniques for extending the algorithm described in the
previous chapter to an n dimensional complex query metric. The baseline of the whole
approach remains the same. First we discuss the extensions to the basic approach
needed to solve for a 4 parameter query metric. Then we give a generalized algorithm
for an n parameter query metric.

3.1 Extending to 4 parameter metric 

Consider the following query metric in 4 parameters.

\[ B(p) = \max(a_0(p), a_1(p), a_2(p), a_3(p)) \]

\[ \text{PQODD}: \; C(p, \bar{x}) = a_0(p) \cdot x_0 + a_1(p) \cdot x_1 + a_2(p) \cdot x_2 + a_3(p) \cdot x_3, \quad 0 < x_0, x_1, x_2, x_3 < \infty \]

The four affine parameterized PQO problems are of the following type:

\[ \text{PQODD}^{(i)}: \; C^{(i)}(p, x, y, z) = a_0(p) + a_1(p) \cdot x + a_2(p) \cdot y + a_3(p) \cdot z, \quad 0 < x, y, z < 1 \]

The parameter space is F(x,y,z), where 0 < x,y,z < 1.
The cost coordinate space is A(p) = (a0(p),a1(p),a2(p),a3(p)). The parameter space
is a cube in three dimensions. The cube is bounded by six two-dimensional faces
{f1,f2,...,f6}. We have eight corner points CORNERS = {O,C1,C2,...,C7}.

\[ \text{EQUILINE}: \; a_0(p) = a_1(p) = a_2(p) = a_3(p) \]

The cost coordinate space is a four dimensional space. The convex hull in the
four dimensional cost coordinate space represents the parametric optimal set of
plans POS.

We need four distinct plans to define a three dimensional plane in the cost
coordinate space. We try to frame a three dimensional plane in the cost coordinate
space through which the EQUILINE passes. If such a plane can be framed by
a subset of four plans optimal at the corner points, we proceed to the Descend_Hull
algorithm; otherwise we extend the partial convex hull in the prospective direction
by searching for an interesting plan on the prospective face fpros.

The four plans and the neighbor p of these four plans form a (convex) polyhedron
in four dimensions, bounded by three dimensional faces. The EQUILINE will either
intersect the polyhedron at two points (by convexity of the polyhedron) or not
intersect it at all. Hence the hull descent always yields a face (three dimensional
plane) having lower B(p) value out of all the vertices of the polyhedron. Also, more
than two plans cannot be collinear in the cost coordinate space, since we choose the
plans to lie on the convex hull.

3.1.1 Computing the best affine approximation 

This algorithm also consists of the following main sub algorithms.

• Algorithm main

• Algorithm Descend_Hull

• Algorithm process_descend

• Algorithm getprospective

• Algorithm getintplan

• Algorithm nearest

The overall approach is the same; only the notion and the method of finding the
prospective face are slightly modified.



3.1.2 Algorithm Main 


1. Optimize at the corner points {O,C1,C2,...,C7}
   /* Let pmset = {p1,p2,...,p8} be the multiset of the eight plans. */
   /* Let Pdistinct ⊆ pmset be the plans which are all distinct. */

2. if (|Pdistinct| == 1)
   { return the unique plan. }
   else
   { if (|Pdistinct| < 4)
     { face = getprospective(pmset);
       Pdistinct = Pdistinct ∪ getsomeplan(face);
       /* make the number of distinct plans 4 by searching on the prospective face. */
     }
   }

3. if (inside(Pdistinct) ≠ ∅)
   /* a 3 dimensional intersecting plane can be framed using corners */
   { Pdistinct = do_boundary_checks();
     /* Let face1,face2 be the two 3 dimensional faces through which the EQUILINE
        passes. Select the facei which has the minimum B(p) plan */
     select B^-1(minimum(B(p1),B(p2),...,B(p8)))
     goto step 4
   }
   else
   { /* Intersecting plane cannot be framed from corner plans */
     temp = getintplan();
     /* getintplan will search for a plan (to frame such a plane) on the prospective
        face. */
     if (inside(temp,Pdistinct) == False) return search_on_boundary(temp);
     /* inside(temp,Pdistinct) = False => The EQUILINE does not intersect the
        convex hull at all */
   }

4. return Descend_Hull.

■ Description Main

In step 1 we optimize at the eight corner points of the parameter space. Let pmset
be the multiset of plans p1,p2,p3,...,p8 optimal at the corners, and let Pdistinct be
the set of distinct plans among these.

We check the cardinality of the set Pdistinct. If it is one, then we return the only
plan, as in the 3 parameter case. If the cardinality of Pdistinct is less than four, we
search for more plans along the prospective direction, that is, on the prospective face
fpros of the parameter space. A prospective face is a bounding face of the parameter
space on which the prospects of finding a plan of interest are better.

The EQUILINE will either intersect the 4 dimensional polyhedron at two points
or not intersect it at all. We also perform a boundary conditions check to deal with
the case where the EQUILINE touches the polyhedron. If the EQUILINE ideally intersects
the polyhedron in two faces, the face(s) which have the plan having minimum B(p),
that is B^-1(minimum(B(p1),B(p2),B(p3),B(p4),B(p5))), is selected.

If the polyhedron is not intersected by the EQUILINE, we search for such a possibility
by developing the partial hull along the prospective direction, by searching on the
prospective face in parameter space. Procedure getintplan does this task. If
getintplan returns a plan which does not form an intersecting plane, it implies that
the whole convex hull is not intersected by the EQUILINE. Since the boundary of
the convex hull represents the parametric optimal plans along the boundary of the
parameter space, we search the boundary.

If we can construct a quadruple of four distinct plans through which the EQUILINE
passes, we descend to the base of the convex hull using the Descend_Hull
algorithm.

■ Algorithm do_boundary_checks

The polyhedron formed by the plans in Pdistinct is convex, and hence the EQUILINE will
either intersect the polyhedron (Pdistinct) at one point INT1 (the boundary condition,
touch), at two points INT1,INT2, or not at all.

This function does the checking for boundary cases. It checks for the following
case:

• Check if {INT1,INT2} ∩ Pdistinct ≠ ∅.

• All other condition checks are taken care of in the Descend_Hull algorithm.

For case 1 it returns INT1 or INT2.

3.1.3 Algorithm getprospective 

Input Pdistinct /* Set of distinct plans */
Output set of tuples {p,q,r,t}

result = tresult = ∅;
for (each face fi of the parameter space)
{
  plansi = plans(fi);
  /* plansi is the multiset of plans optimal at the corners of the face fi */
  for (each ss ⊂ plansi | |ss| = 3)
  {
    pother = a distinct plan belonging to a corner not on this face;
    stemp = nearest(ss, pother);
    /* {p,q,r} ∈ stemp */
    for (each {p,q,r} ∈ stemp)
    { if ({p,q,r} forms adjacent corners of this face)
      { tresult = tresult ∪ {p,q,r}; }
    }
  }
}
for (each fi)
{ if (each triple of adjacent corners is nearest)
    result = result ∪ fi;
}
return result;



Figure 3.1: Prospective face in 4 parameter metric. 


■ Description getprospective

This function returns the prospective face of the parameter space, on which the
prospects of finding the plan of interest are brighter. For each face of the parameter
space, we consider a subset of 3 adjacent plans on the face and a plan pother not
belonging to this face. If the nearest face of the tetrahedron of these four plans
corresponds to some real face, then we add the three adjacent plans to the result set
tresult. If all such triples of adjacent plans of a face are real, then that
face of the parameter space is the prospective face.

3.1.4 Algorithm getintplan

Input FaceSet, All_corners
Output Pint

prosfaceset = getprospective(FaceSet)
for (each pros ∈ prosfaceset)
{
  temp.s = Isocost_fi(pros.plans);
  /* pros.plans = {pros.p1, pros.p2, pros.p3, pros.p4} */
  temp.plan = optimize(temp.s);
  if (inside(ss, temp.plan) | for some ss ⊂ All_corners and |ss| == 3)
  { return temp.plan }
  tempprosset = the four subfaces partitioned at temp.s
  remove(prosfaceset, pros)
  prosfaceset = prosfaceset ∪ getprospective(tempprosset)
}
return temp.plan

■ Description getintplan

The getintplan procedure first finds the prospective face fi. It then finds the isocost
point of the four plans at the corners. Let pisocost be the plan optimal at the isocost
point. If an intersecting plane can be formed using the plan pisocost, return pisocost;
otherwise find which of the four subfaces is prospective. Recurse till the
parameter space on the face is completely traversed. Repeat this divide and conquer
strategy for each prospective face belonging to prosfaceset.

3.1.5 Algorithm Descend_Hull 

Input p,q,r,t,u,s1,s2,s3,s4,s5;
Output plan;

if (u ≠ ∅)
{ temp.s = s5; temp.plan = u; }
else
{ temp.s = ISOCOST(p,q,r,t); temp.plan = optimize(temp.s); }
while (not (temp.plan ∈ {p,q,r,t}))
{
  {p,q,r,t} = process_descend(p,q,r,t,temp.plan,s1,s2,s3,s4,temp.s);
  if (|{p,q,r,t}| == 0) return main(cube(s1,s2,s3,s4));
  temp.s = ISOCOST(p,q,r,t);
  temp.plan = optimize(temp.s);
}
return best(p,q,r,t);

3.1.6 Algorithm process_descend

Input p, q, r, t, pisocost, s1, s2, s3, s4, sisocost
Output {p,q,r,t}

{INT1,INT2} = intersection(p,q,r,t,pisocost,EQUILINE).
if (INT2 ∈ {p,q,r,t,pisocost}) return INT2;
/* If INTi is any vertex, return that INTi */
if (INT2 = ∅)
{ /* EQUILINE touches polyhedron, generate set of new appropriate plans */
  return ∅;
}
if (INT1 ∈ tetra(pqrt) and INT2 ∈ tetra(pqrt))
{ return ∅;
  /* The EQUILINE lies in the tetrahedral plane of a face of polyhedron */
}
if (INT1 ∈ face1 and INT2 ∈ face2)
{ return face2; }
else
{ return ∅; }

3.1.7 Algorithm nearest 

Input: p,q,r,t;
/* Four plans forming a 3 dimensional plane */
Output: result

PL = plane(p,q,r,t);
/* PL is 3 dimensional plane */
INT = intersection(EQUILINE,PL);
for (each ss ⊂ {p,q,r,t}, |ss| = 3)
{ dv = direction vector of 2 dimensional plane formed by ss
  if ((dprod = (dv·INT)·(dv·w)) > 0, w ∈ {p,q,r,t} and w ∉ ss)
    result = result ∪ ss
}
return result;

3.1.8 Algorithm inside

Input: p,q,r,t;
Output: True or False

result = nearest(p,q,r,t);
if (|result| == 4) return True; else return False;

■ Description nearest and inside

The function nearest finds, of the four faces of the tetrahedron formed by four plans,
the face which is nearest to the EQUILINE. The function finds the intersection point
of the EQUILINE with the 3 dimensional plane formed by the four plans, say INT.
Let dv be the direction cosines of the 2 dimensional face A(q)A(r)A(t). If the product
of the dot product of dv with the coordinates of A(p) and the dot product of dv with
INT is negative, then p and INT lie on opposite sides, and this face A(q)A(r)A(t) is
the nearest face.

The function inside finds whether INT is inside the tetrahedron pqrt. It calls nearest
for each three vertex subset of {p,q,r,t}; if each vertex and INT lie on the same
side with respect to the other three vertices (the opposite face), then the
cardinality, or number of tuples, returned by nearest would be 4. Hence INT lies
inside the tetrahedron pqrt.

3.1.9 Miscellaneous Algorithms

The following miscellaneous functions have also been used. Optimize
finds the plan(s) optimal at any point s in the parameter space. Isocost finds the
isocost point sisocost of a set of plans. Cube returns the cube of parameter space
formed by 4 points s1,s2,s3,s4 in the parameter space. Intersection finds the
intersection of the EQUILINE with the plane formed by 4 plans in cost coordinate space,
plane(A(p),A(q),A(r),A(t)). getsomeplan finds out some plan on the given edge of
parameter space.
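The intersection and inside tests used in this section can be written in a dimension
independent way. The sketch below is an alternative formulation, not the nearest-based
procedure described above: given k plans in a k dimensional cost coordinate space, it
solves for barycentric weights lambda_i and a line parameter t such that
sum(lambda_i · A(p_i)) = (t,...,t) and sum(lambda_i) = 1. The EQUILINE passes through
the simplex formed by the plans exactly when all weights are non-negative. All names
are illustrative assumptions.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Gauss-Jordan elimination over Fractions; None if the system is singular."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(b)]
            for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next((i for i in range(col, n) if rows[i][col] != 0), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pv = rows[col][col]
        rows[col] = [v / pv for v in rows[col]]
        for i in range(n):
            if i != col and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [vi - factor * vc
                           for vi, vc in zip(rows[i], rows[col])]
    return [row[-1] for row in rows]


def equiline_inside_simplex(vertices):
    """Return (inside, t) for k plans in k-dimensional cost space, or None.

    The EQUILINE meets the hyperplane of the plans at (t, ..., t); inside is
    True when that point lies in the simplex spanned by the plans.
    """
    k = len(vertices)
    # Unknowns: lambda_0 .. lambda_{k-1}, then t.
    matrix = [[v[d] for v in vertices] + [-1] for d in range(k)]
    matrix.append([1] * k + [0])        # the weights sum to one
    rhs = [0] * k + [1]
    sol = solve(matrix, rhs)
    if sol is None:
        return None                     # EQUILINE parallel to the hyperplane
    *lams, t = sol
    return all(lam >= 0 for lam in lams), t
```

With k = 3 this covers the triangle test of the previous chapter, and with k = 4 the
tetrahedron test of this section.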

3.2 Extending to n+1 parameter metric 

The above approach, which has been extended to the 4 parameter case, can now easily
be made to work for any n-parametric cost equation. Consider the following query
metric in n+1 parameters.

\[ B(p) = \max(a_0(p), a_1(p), a_2(p), \ldots, a_n(p)) \]

\[ \text{PQODD}: \; C(p, \bar{x}) = a_0(p) \cdot x_0 + a_1(p) \cdot x_1 + a_2(p) \cdot x_2 + \ldots + a_n(p) \cdot x_n, \quad 0 < x_0, x_1, x_2, \ldots, x_n < \infty \]

The n+1 affine parameterized PQO problems are of the following type:

\[ \text{PQODD}^{(i)}: \; C^{(i)}(p, s_1, s_2, \ldots, s_n) = a_0(p) + a_1(p) \cdot s_1 + a_2(p) \cdot s_2 + \ldots + a_n(p) \cdot s_n, \quad 0 < s_1, s_2, \ldots, s_n < 1 \]

The cost coordinate space is A(p) = (a0(p),a1(p),a2(p),...,an(p)).

\[ \text{EQUILINE}: \; a_0(p) = a_1(p) = \ldots = a_n(p) \]

The parameter space consists of a hypercube in n dimensions. This hypercube is
bounded by 2n (n-1)-dimensional faces {f1,f2,...,f2n}. We have 2^n corner points
CORNERS = {O,C1,C2,...,Ck}, k = 2^n - 1. The cost coordinate space is an n+1
dimensional space. The convex hull in the n+1 dimensional cost coordinate space
represents the parametric optimal set of plans POS.
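The counts of corners and faces can be checked with a short enumeration. The sketch
assumes the hypercube is the unit cube with corners at 0 and 1 in every parameter; a
face is represented, for illustration, as a pair (axis, value) that fixes one
coordinate.

```python
from itertools import product


def corners(n):
    """All 2^n corner points of the unit hypercube in n dimensions."""
    return list(product((0, 1), repeat=n))


def faces(n):
    """The 2n bounding faces; each fixes one coordinate at 0 or at 1."""
    return [(axis, value) for axis in range(n) for value in (0, 1)]


def face_corners(n, face):
    """Corners of the hypercube that lie on the given face."""
    axis, value = face
    return [c for c in corners(n) if c[axis] == value]
```

For n = 3 this gives the eight corners and six faces of the 4 parameter case, with
four corners on each face.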

We need n+1 distinct plans to define an n dimensional hyperplane in the n+1 dimensional
cost coordinate space. We try to frame an n dimensional hyperplane in the cost
coordinate space through which the EQUILINE passes. If such a plane can be framed by a
subset of plans optimal at the corner points, we proceed to Descend_Hull; otherwise
we extend the partial convex hull in the prospective direction by searching for an
interesting plan on the prospective face fpros.

The n+1 plans and the neighbor p of these n+1 plans form a (convex) polyhedron
in n+1 dimensions, bounded by n dimensional faces. Similarly, the EQUILINE will
either intersect the polyhedron at 2 points (by convexity of the polyhedron) or not
intersect it at all. Hence the hull descent always yields a face (n dimensional
plane) having lower B(p) value out of all the vertices of the polyhedron.

3.2.1 Computing the best affine approximation 

In step 1 we optimize at the corner points of the parameter space. Let pmset be the 
multiset of plans pi,P 2 ,P 3 ,- ■ ;Pk, k = 2" optimal at the corners and let Pdistinct be 
the set of distinct plans among these. We check the cardinality of the Pdistinct set. 

If the cardinality of $P_{distinct}$ is one, we return the only plan, as in the
four-parameter case. If the cardinality of $P_{distinct}$ is less than n+1, we search
for more plans along the prospective direction, that is, on the prospective face
$f_{pros}$ of the parameter space. A prospective face is the bounding face of the
parameter space on which the prospects of finding a plan of interest are better.
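The case analysis on the cardinality of the distinct-plan set can be sketched as follows. The `optimize` argument is a hypothetical stand-in for the optimizer call (it maps a corner to a plan identifier), the corners are assumed to be those of the unit hypercube, and the action labels are illustrative only.

```python
from itertools import product

def classify_corner_plans(optimize, n):
    """Optimize at every corner and decide the next step.

    optimize: hypothetical callable mapping a parameter point to a plan id.
    Returns (action, distinct_plans) where action is one of
    "return_plan", "search_prospective_face", "try_hyperplane".
    """
    corners = list(product((0, 1), repeat=n))
    pmset = [optimize(c) for c in corners]   # multiset: one plan per corner
    p_distinct = sorted(set(pmset))          # distinct plans among them
    if len(p_distinct) == 1:
        return "return_plan", p_distinct
    if len(p_distinct) < n + 1:
        return "search_prospective_face", p_distinct
    return "try_hyperplane", p_distinct

# Toy optimizer (assumption): the plan depends only on how many parameters are 1.
print(classify_corner_plans(lambda c: "P%d" % sum(c), 2))
```

With the toy optimizer and n = 2, the four corners yield three distinct plans, which equals n+1, so the sketch moves on to framing a hyperplane.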

The EQUILINE either intersects the (n+1)-dimensional polyhedron at two
points or does not intersect it at all. The boundary conditions can also be suitably
handled. If the EQUILINE ideally intersects the polyhedron in two faces, the face
having the minimum B(p) plan, that is,
$B^{-1}(\min(B(p_1), B(p_2), B(p_3), \dots, B(p_{n+1})))$,
is selected.
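Selecting the minimum B(p) plan among candidates amounts to a min-over-max computation. The sketch below assumes each plan's cost vector has already been evaluated at fixed parameter values; the plan names and numbers are made up for illustration.

```python
def bottleneck(cost_vector):
    """B(p) = max(a_0(p), ..., a_n(p)) for one plan's cost vector."""
    return max(cost_vector)

def best_plan(plans):
    """Return the (name, cost_vector) pair with the smallest B(p)."""
    return min(plans.items(), key=lambda item: bottleneck(item[1]))

# Illustrative cost vectors (assumed values, not from the thesis).
plans = {"p1": (4, 7, 2), "p2": (5, 5, 5), "p3": (3, 6, 6)}
name, costs = best_plan(plans)
print(name, bottleneck(costs))  # p2 5
```

Here the B values are 7, 5, and 6, so p2 is chosen. Its cost vector also has all components equal, that is, it lies on the EQUILINE.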

If the polyhedron is not intersected by the EQUILINE, we look for an intersection
by developing the partial hull along the prospective direction, searching on the
prospective face in parameter space. If getintplan returns a plan which does not
form an n-dimensional intersecting plane, it implies that the whole of the convex
hull is not intersected by the EQUILINE. Since the boundary of the convex hull
represents the parametric optimal plans along the boundary of the parameter space,
we search the boundary.

If we can construct an (n+1)-tuple of n+1 distinct plans through which the
EQUILINE passes, we descend to the base of the convex hull using the Descend_Hull
algorithm.

3.3 Computational Analysis 

The total computational overhead can be analyzed as follows. In the whole approach,
the time spent in the optimize function dominates and must be kept low. In
step 1 of the main algorithm, optimizing at the corners of the parameter space makes,
in the worst case, $2^n$ calls to the optimize function for an n-dimensional parameter
space and an (n+1)-parameter metric.

The Descend_Hull algorithm is the other function which makes calls to the
optimize function. The number of calls to optimize is the same as the number of
iterations of the Descend_Hull loop.
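As a rough tally of the two sources of optimizer calls just described, one can write the following. The symbols $I$ (number of Descend_Hull iterations), $T_{\text{opt}}$ (cost of one optimize call) and $T_{\text{geom}}$ (cost of the remaining geometric computations) are introduced here for illustration and do not appear in the thesis.

```latex
\[
  N_{\text{optimize}} \;\le\; 2^{n} + I,
  \qquad
  T_{\text{total}} \;\approx\; \left(2^{n} + I\right) T_{\text{opt}} + T_{\text{geom}}
\]
```

For example, with n = 3 the corner step contributes at most $2^3 = 8$ calls, and each Descend_Hull iteration adds one more.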

All other procedures are essentially mathematical computations on the convex
hull in the cost coordinate space. The getprospective procedure declares a face
$f_i$ of the parameter space prospective only if, for each (n+1)-subset of the plans
optimal at the corners of $f_i$, the nearest plane is the real one. It stops
processing a face $f_j$ as soon as one of its subsets has a non-real plane as the
nearest one. This also gives considerable savings.

The decision that the EQUILINE does not intersect the hull also gives savings,
since in this case some plan on the base of the hull is returned. By optimizing in
the neighborhood of the base plan we can get the least B(p) plan, as
base $\subset$ Boundary.

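Thus the overhead of the above approach is dominated by the calls to optimize: at
most $2^n$ at the corners plus one per iteration of the Descend_Hull loop.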
Chapter 4

Conclusions and Future Work 


This chapter concludes by summarizing the work done in this thesis. We also suggest
some future directions for research in the area of parametric query optimization.

4.1 Conclusions and Summary 

Algorithms have been developed for solving the query optimization problem when
the query metrics are non-parametric in nature. These complex query metrics occur
in parallel and distributed query processing environments. The very nature of such
a query metric does not allow us to apply dynamic programming approaches.

We tried to solve the query optimization problem for such non-parametric query
metrics by formulating a set of n affine problems from them and applying PQO
techniques to these problems.

We build our approach on the result that the convex hull of plans in cost
coordinate space represents the parametric optimal set of plans [Ganguly98]. We
proved that the boundary of the convex hull represents the parametric optimal set
of plans along the boundary of the parameter space.

The algorithm has been developed for the cost metric 

$B(p) = \max(a_0(p), a_1(p), \dots, a_n(p))$

The algorithm has been generalized to work for n-parameter cost metrics. The
above approaches can be used for other monotonic query metrics like

$B(p) = \Phi(a_0(p), a_1(p), \dots, a_n(p))$


Every monotonic function decreases along the convex hull from the boundary
of the hull towards the base of the hull. The EQUILINE in the above case has the
unique property that its intersection with the convex hull in cost coordinate space
gives the best approximation of the minimum B(p) plan. The descent to the base of
the hull is done in the parameter space by the Descend_Hull algorithm.

For any monotonic cost metric, we just need to find a curve like the EQUILINE,
a DESCLINE, along which to descend to the base of the hull.

4.2 Directions for Future work 

Parametric Query Optimization is a relatively unexplored area with a large scope
of problems to be solved. In this thesis we tried to apply PQO techniques to one
type of non-parametric query metric,
$B(p) = \max(a_0(p), a_1(p), \dots, a_n(p))$. Many issues still remain untouched.

• One immediate avenue is to apply PQO to solve query metrics like 

B{p) = a§(p),4(p),...,o*(p)) 
which occur in parallel query processing environments. 

• Developing approaches for query metrics that are not strictly monotonic
is also worthwhile theoretical work.

• Examining the above and other existing algorithms from an implementation point
of view is worthwhile.

• Developing a parametric query optimizer for distributed and parallel
databases is another direction.